Smart Paper Notes

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
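
The patch-based strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes a 3D volume held as a NumPy array and a regular patch grid; the function names, patch size, and stride are hypothetical choices for illustration and are not taken from the survey.

```python
import numpy as np


def patch_corners(shape, patch_size, stride):
    """Return the start indices, per axis, of a regular patch grid.

    The last patch on each axis is shifted back so that it ends exactly at
    the border, which means border patches may overlap their neighbours.
    """
    corners = []
    for dim, patch, step in zip(shape, patch_size, stride):
        if patch > dim:
            raise ValueError("patch is larger than the volume")
        starts = list(range(0, dim - patch + 1, step))
        if starts[-1] != dim - patch:
            starts.append(dim - patch)
        corners.append(starts)
    return corners


def extract_patches(volume, patch_size, stride):
    """Yield (corner, patch) pairs that together cover a 3D volume."""
    z_starts, y_starts, x_starts = patch_corners(volume.shape, patch_size, stride)
    pz, py, px = patch_size
    for z in z_starts:
        for y in y_starts:
            for x in x_starts:
                yield (z, y, x), volume[z:z + pz, y:y + py, x:x + px]


# Toy stand-in for a scan that is too large to process at once.
volume = np.zeros((5, 6, 6), dtype=np.float32)
patches = list(extract_patches(volume, patch_size=(2, 4, 4), stride=(2, 2, 2)))
```

Each patch can then be fed to the model separately, and the per-patch predictions stitched back together using the stored corner coordinates.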

Translated by Google Translate

Solving Elliptic Problems with Singular Sources using Singularity Splitting Deep Ritz Method

Tianhao Hu, Bangti Jin, Zhi Zhou

Categories: Machine Learning

2022-09-07

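In this work, we develop an efficient solver, based on deep neural networks, for the Poisson equation with variable coefficients and singular sources expressed by the Dirac delta function $\delta(\mathbf{x})$. This class of problems covers general point sources, line sources, and point-line combinations, and has a broad range of practical applications. The proposed approach decomposes the true solution into a singular part, which is known analytically through the fundamental solution of the Laplace equation, and a regular part, which satisfies a suitable elliptic PDE with a smoother source; the regular part is then solved with the deep Ritz method. A path-following strategy is suggested for selecting the penalty parameter that enforces the Dirichlet boundary condition. Extensive numerical experiments in two- and multi-dimensional spaces with point sources, line sources, or their combinations are presented to illustrate the efficiency of the proposed approach, and a comparative study with several existing approaches is provided, which clearly shows its competitiveness for this specific class of problems. In addition, we briefly discuss the error analysis of the approach.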
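
The splitting idea can be written out for the simplest case. The block below is a sketch under stated assumptions, not the paper's formulation: it assumes a constant coefficient $a \equiv 1$, a single point source at $\mathbf{x}_0$ in a domain $\Omega \subset \mathbb{R}^d$, Dirichlet data $g$, and one common form of the penalized Ritz loss with penalty parameter $\sigma$. In the variable-coefficient setting of the abstract, the regular part satisfies a PDE with a nonzero, smoother source, which is not reproduced here.

```latex
\begin{align*}
% Model problem (assumed simplification: a \equiv 1, one point source)
-\Delta u &= \delta(\mathbf{x}-\mathbf{x}_0) \ \text{in } \Omega,
  & u &= g \ \text{on } \partial\Omega, \\
% Singularity splitting: \Phi is the fundamental solution of the Laplace equation
u &= \Phi + v,
  & \Phi(\mathbf{x}) &=
  \begin{cases}
    -\dfrac{1}{2\pi}\ln|\mathbf{x}-\mathbf{x}_0|, & d = 2, \\[2ex]
    \dfrac{1}{4\pi\,|\mathbf{x}-\mathbf{x}_0|}, & d = 3,
  \end{cases} \\
% Regular part: no singular source remains
-\Delta v &= 0 \ \text{in } \Omega,
  & v &= g - \Phi \ \text{on } \partial\Omega, \\
% Penalized Ritz loss for a network v_\theta (one common form)
\mathcal{L}_\sigma(v_\theta) &=
  \frac{1}{2}\int_\Omega |\nabla v_\theta|^2 \,\mathrm{d}\mathbf{x}
  + \frac{\sigma}{2}\int_{\partial\Omega} \bigl(v_\theta - (g-\Phi)\bigr)^2 \,\mathrm{d}s .
\end{align*}
```

Since $-\Delta\Phi = \delta(\mathbf{x}-\mathbf{x}_0)$, subtracting $\Phi$ removes the singularity, so the network only has to approximate the smooth function $v$.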

Antecedent Predictions Are Dominant for Tree-Based Code Generation

Yihong Dong, Ge Li, Zhi Jin

Categories: Artificial Intelligence

2022-08-22

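Code generation focuses on automatically converting natural language (NL) utterances into code snippets. Sequence-to-tree (Seq2Tree) methods, such as TRANX, are proposed for code generation and guarantee the compilability of the generated code; they generate each subsequent abstract syntax tree (AST) node relying on the antecedent predictions of AST nodes. Existing Seq2Tree methods tend to treat antecedent predictions and subsequent predictions equally. However, under the AST constraints, it is difficult for Seq2Tree models to produce correct subsequent predictions from incorrect antecedent predictions. Antecedent predictions therefore ought to receive more attention than subsequent predictions. To this end, in this paper we propose an effective method, named APTRANX (Antecedent Prioritized TRANX), built on TRANX. APTRANX contains an Antecedent Prioritized (AP) loss, which helps the model attach importance to antecedent predictions by exploiting the position information of the generated AST nodes. With better antecedent predictions and the subsequent predictions that follow from them, APTRANX significantly improves performance. We conduct extensive experiments on several benchmark datasets, and the results demonstrate the superiority and generality of our proposed method compared with state-of-the-art methods.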

A Tree-structured Transformer for Program Representation Learning

Wenhan Wang, Kechi Zhang, Ge Li, Shangqing Liu, Zhi Jin, Yang Liu

Categories: Machine Learning

2022-08-18

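When deep learning techniques are used to model programming languages, neural networks with tree or graph structures are widely adopted to capture the rich structural information in program abstract syntax trees (ASTs). However, long-term/global dependencies exist widely in programs, and most of these neural architectures fail to capture them. In this paper, we propose Tree-Transformer, a novel recursive tree-structured neural network that aims to overcome these limitations. Tree-Transformer uses two multi-head attention units to model the dependencies between sibling and parent-child node pairs. Moreover, we propose a bidirectional propagation strategy that allows node information to pass in two directions along the tree: bottom-up and top-down. By combining bottom-up and top-down propagation, Tree-Transformer can learn both global contexts and meaningful node features. Extensive experimental results show that Tree-Transformer outperforms existing tree-based and graph-based neural networks on program-related tasks with both tree-level and node-level prediction targets, indicating that Tree-Transformer performs well at learning both tree-level and node-level representations.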

Underwater Ranker: Learn Which Is Better and How to Be Better

Chunle Guo, Ruiqi Wu, Xin Jin, Linghao Han, Zhi Chai, Weidong Zhang, Chongyi Li

Categories: Computer Vision

2022-08-14

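In this paper, we present a ranking-based underwater image quality assessment (UIQA) method, abbreviated as URanker. URanker is built on an efficient attention-based image Transformer. For underwater images, we specially design (1) a histogram embedding that encodes the color distribution of an underwater image as a histogram token to attend to global degradation, and (2) a dynamic cross-scale correspondence to model local degradation. The final prediction depends on the class tokens from different scales, which comprehensively accounts for multi-scale dependencies. With a margin ranking loss, URanker can accurately rank underwater images of the same scene, enhanced by different underwater image enhancement (UIE) algorithms, according to their visual quality. To achieve this, we also contribute a dataset, URankerSet, which contains sufficient results enhanced by different UIE algorithms, together with the corresponding perceptual rankings, to train URanker. Apart from the good performance of URanker, we find that a simple U-shape UIE network obtains promising performance when combined with our pre-trained URanker. In addition, we propose a normalization tail that can significantly improve the performance of UIE networks. Extensive experiments demonstrate the state-of-the-art performance of our method. The key designs of our method are discussed. We will release our dataset and code.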
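
The margin ranking loss named in the abstract has a standard pairwise form, sketched below in plain Python. The function names, the example scores, and the margin value of 0.5 are illustrative assumptions; the abstract does not state the margin or how pairs are sampled.

```python
def margin_ranking_loss(score_better, score_worse, margin=0.5):
    """Hinge-style ranking loss for one pair of predicted quality scores.

    The loss is zero once the image ranked higher by humans scores at least
    `margin` above the image ranked lower; otherwise it grows linearly.
    """
    return max(0.0, margin - (score_better - score_worse))


def mean_pairwise_loss(pairs, margin=0.5):
    """Average the ranking loss over (score_better, score_worse) pairs."""
    return sum(margin_ranking_loss(b, w, margin) for b, w in pairs) / len(pairs)


# Hypothetical scores for two enhanced versions of the same scene.
correctly_ordered = margin_ranking_loss(2.0, 1.0)  # gap exceeds the margin
wrongly_ordered = margin_ranking_loss(1.0, 1.2)    # worse image scored higher
```

Because the loss depends only on score differences within a pair, the ranker learns a relative ordering rather than an absolute quality scale.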

What does Transformer learn about source code?

Kechi Zhang, Ge Li, Zhi Jin

Categories: Artificial Intelligence

2022-07-18

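In the field of source code processing, Transformer-based representation models have shown great power and achieved state-of-the-art (SOTA) performance on many tasks. Although Transformer models process source code as a sequence, evidence shows that they may also capture structural information (e.g., from the syntax tree, data flow, control flow, etc.). We propose the aggregated attention score, a method for investigating the structural information learned by the Transformer. We also propose the aggregated attention graph, a new way to extract program graphs from pre-trained models. We evaluate our methods from multiple perspectives. Furthermore, based on our empirical findings, we use the automatically extracted graphs to replace the elaborate, manually designed graphs in the variable misuse task. Experimental results show that the semantic graphs we extract automatically are meaningful and effective, which offers a new perspective for understanding and using the information contained in the model.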

STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding

Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

Categories: Computer Vision

2022-07-06

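In this technical report, we introduce our solution to the human-centric spatio-temporal video grounding task. We propose a concise and effective framework named STVGFormer, which models spatio-temporal visual-linguistic dependencies with a static branch and a dynamic branch. The static branch performs cross-modal understanding within a single frame and learns to localize the target object spatially according to intra-frame visual cues such as object appearance. The dynamic branch performs cross-modal understanding across multiple frames. It learns to predict the start and end times of the target moment according to dynamic visual cues such as motion. Both the static and dynamic branches are designed as cross-modal Transformers. We further design a novel static-dynamic interaction block that lets the static and dynamic branches pass useful and complementary information to each other, which is shown to be effective in improving predictions on hard cases. Our proposed method achieved 39.6% vIoU and won first place in the HC-STVG track of the 4th Person in Context Challenge.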

Precise Learning of Source Code Contextual Semantics via Hierarchical Dependence Structure and Graph Attention Networks

Zhehao Zhao, Bo Yang, Ge Li, Huai Liu, Zhi Jin

Categories: Machine Learning

2021-11-20

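Deep learning is widely used in various software engineering tasks, such as program classification and defect prediction. Although the technique eliminates the need for feature engineering, the construction of the source code model significantly affects performance on these tasks. Recent works mainly focus on complementing AST-based source code models by introducing contextual dependencies extracted from the CFG. However, all of them pay little attention to the representation of basic blocks, which are the basis of contextual dependencies. In this paper, we integrate the AST and the CFG and propose a novel source code model embedded with hierarchical dependencies. Based on this model, we also design a neural network that relies on the graph attention mechanism. Specifically, we introduce the syntactic structure of each basic block, i.e., its corresponding AST, into the source code model to provide sufficient information and fill the gap. We evaluated the model on three practical software engineering tasks and compared it with other state-of-the-art methods. The results show that our model can significantly improve performance. For example, compared with the best-performing baseline, our model reduces the scale of parameters by 50% and achieves a 4% improvement in accuracy on the program classification task.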

Risk-Averse MDPs under Reward Ambiguity

Haolin Ruan, Zhi Chen, Chin Pang Ho

Categories: Machine Learning

2023-01-03

We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
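
The "weighted average of mean and percentile performances" can be made concrete on a fixed sample of returns. The sketch below is an empirical illustration only: it assumes the percentile is the lower `alpha`-quantile of sampled returns, and it omits the Wasserstein ambiguity set and the policy optimization entirely. The function name and the parameters `weight` and `alpha` are hypothetical.

```python
import numpy as np


def return_risk_objective(returns, weight, alpha):
    """Blend the mean return with the lower alpha-quantile of the returns.

    weight = 1 recovers the plain expected return; weight = 0 recovers a
    purely percentile-based (risk-focused) criterion.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    returns = np.asarray(returns, dtype=float)
    mean_part = returns.mean()
    percentile_part = np.quantile(returns, alpha)
    return weight * mean_part + (1.0 - weight) * percentile_part


# Hypothetical sampled returns of one fixed policy.
sampled_returns = [0.0, 1.0, 2.0, 3.0, 4.0]
balanced = return_risk_objective(sampled_returns, weight=0.5, alpha=0.0)
```

Lowering `weight` or `alpha` makes the criterion more sensitive to the worst outcomes, which is the risk-averse end of the trade-off.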

Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

Timothy T. Yu, Da Ma, Jayden Cole, Myeong Jin Ju, Mirza F. Beg, Marinko V. Sarunic

Categories: Artificial Intelligence | Computer Vision

2023-01-02

Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been utilized to reconstruct subsampled OCT data, and more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to explore methods to better aid clinicians in their decision-making to improve patient outcomes, by reconstructing lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture.
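
The link between spectral bandwidth and axial resolution can be checked numerically. The sketch below assumes that the axial point-spread function is the magnitude of the Fourier transform of the spectral window, and it compares two Gaussian windows of different widths. The function names, the 1024-sample spectrum, and the window widths are illustrative assumptions rather than the study's settings.

```python
import numpy as np


def gaussian_window(n_samples, fwhm_fraction):
    """Gaussian spectral window whose FWHM is a fraction of the spectrum."""
    if fwhm_fraction <= 0:
        raise ValueError("fwhm_fraction must be positive")
    k = np.arange(n_samples) - (n_samples - 1) / 2.0
    sigma = fwhm_fraction * n_samples / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * (k / sigma) ** 2)


def axial_psf_fwhm(window, pad_factor=8):
    """Estimate the FWHM, in depth pixels, of the axial point-spread function.

    The PSF is taken as the magnitude of the zero-padded FFT of the window;
    padding only refines the sampling of the PSF along depth.
    """
    psf = np.abs(np.fft.fft(window, len(window) * pad_factor))
    psf /= psf.max()
    return np.count_nonzero(psf >= 0.5) / pad_factor


full_bandwidth = axial_psf_fwhm(gaussian_window(1024, 0.50))
reduced_bandwidth = axial_psf_fwhm(gaussian_window(1024, 0.25))
```

Multiplying a measured spectrum by the narrower window before the FFT therefore yields an A-scan with a broader axial PSF, which is the kind of degradation the learning-based reconstruction is asked to undo.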
